IN THE CLAIMS 



1. (Previously Presented) An apparatus, in an integrated circuit (IC) of a data processing
system having at least one host processor and host memory, comprising: 
a chip interconnect; 

a host interface coupled to the chip interconnect for interfacing the IC with the at least 
one host processor external to the IC; 

a memory interface coupled to the chip interconnect for accessing a memory external 
to the IC, the memory interface including a non-coherent interface for 
interfacing the IC with the host memory external to the IC, the memory 
interface including a coherent interface for interfacing the IC with a cache 
memory external to the IC via the at least one host processor; 

a memory controller coupled to the chip interconnect for controlling the host memory 
comprising DRAM memory via the memory interface, the memory controller 
to determine whether to access the memory through the coherent interface or 
the non-coherent interface; 

a scalar processing unit coupled to the chip interconnect, the scalar processing unit 
executing instructions to perform scalar data processing; 

a vector processing unit coupled to the chip interconnect, the vector processing unit 
executing instructions to perform vector data processing; and 

an input and output (I/O) interface coupled to the chip interconnect for interfacing the 
IC with an I/O controller of the data processing system, the I/O controller being 
external to the IC for controlling I/O devices of the data processing system, 
wherein the chip interconnect, the memory controller, the scalar processing 
unit, the vector processing unit, the I/O interface, the host interface, and the
memory interface are implemented within the IC which is a single chipset
interfacing the at least one host processor and the host memory with other 
components of the data processing system, including the I/O controller and the 
I/O devices. 

2. (Previously Presented) The apparatus of claim 1, further comprising a switch mechanism 
coupled to the chip interconnect and coupled to the scalar processing unit and coupled to the
vector processing unit, the switch mechanism operable to receive multiple media data streams 
from the I/O interface and dispatch the multiple media data streams to the scalar processing 
unit and/or the vector processing unit. 

3. (Previously Presented) The apparatus of claim 1, further comprising: 

multiple scalar processing units, the multiple scalar processing units executing
instructions to perform scalar processing substantially simultaneously; and

multiple vector processing units, the multiple vector processing units executing 
instructions to perform vector processing substantially simultaneously. 

4. (Original) The apparatus of claim 3, further comprising multiple scalar processing 
units of a kind and multiple vector processing units of a kind.

5. (Previously Presented) The apparatus of claim 1, further comprising: 

a general purpose register (GPR) file coupled to the scalar processing unit; 

a vector register (VR) file coupled to the vector processing unit; and 

a load and store unit (LSU), the LSU executing instructions to load and store scalar
data from and to the GPR, and the LSU executing instructions to load and store
vector data from and to the VR.

6. (Original) The apparatus of claim 5, further comprising a memory location coupled to
the chip interconnect, wherein the LSU loads and stores data from and to the memory 
location. 

7. (Original) The apparatus of claim 6, further comprising a direct memory access (DMA) 
engine, the DMA engine transferring the multiple media data between the memory location 
and the host memory. 

8. (Original) The apparatus of claim 5, wherein the LSU is capable of executing 
instructions to load and store various formats of scalar and vector data, wherein the various 
formats comprise 8-bit, 16-bit, and 32-bit formats. 

9. (Previously Presented) The apparatus of claim 1, wherein the switch mechanism 
comprises an instruction unit (IUNIT), the IUNIT controlling and dispatching instructions 
substantially simultaneously. 

10. (Original) The apparatus of claim 9, wherein the instructions comprise very long 
instruction word (VLIW) instructions. 

11. (Original) The apparatus of claim 9, wherein the IUNIT further comprises:

a program counter;

a branch unit, wherein the program counter and the branch unit determine the location
to fetch next instructions;

an instruction cache memory, the instruction cache memory comprising instruction
cache tag and data memories for buffering instructions transmitted from the
host memory; and

at least one memory mapped registers accessible by the host.

12. (Previously Presented) The apparatus of claim 1, wherein the scalar processing unit
comprises: 

an integer arithmetic and logic unit (IALU), the IALU executing instructions to 
perform simple scalar integer arithmetic and logical operations; and 

an integer shift unit (ISHU), the ISHU executing instructions to perform scalar bit
shifting and rotating operations.

13. (Previously Presented) The apparatus of claim 12, wherein the scalar processing unit 
further comprises a floating point unit (FPU), the FPU executing instructions to perform high 
precision scalar data processing. 

14. (Previously Presented) The apparatus of claim 1, wherein the vector processing unit 
comprises: 

a vector permute unit (VPU), the VPU executing instructions to perform vector 
permute operations; 

a vector simple integer unit (VSIU), the VSIU executing instructions to perform vector
simple integer arithmetic and logical operations;

a vector complex integer unit (VCIU), the VCIU executing instructions to perform
vector complex integer arithmetic operations; and

a vector look-up table unit (VLUT), the VLUT executing instructions to perform at
least one vector table look-up.

15. (Previously Presented) The apparatus of claim 14, wherein the vector processing unit
further comprises a vector floating point unit (VFPU), the VFPU executing instructions
to perform high precision vector data processing.

16. (Original) The apparatus of claim 14, wherein the VLUT comprises a memory location
storing at least one look-up table (LUT). 

17. (Original) The apparatus of claim 16, wherein data of the LUT are transferred from the
host memory to the memory location through a direct memory access (DMA) operation. 

18. (Original) The apparatus of claim 16, wherein the memory location comprises a static
random access memory (SRAM). 

19. (Original) The apparatus of claim 1, wherein the scalar and vector processing units are 
capable of performing data processing autonomously and asynchronously to the host 
processor. 

20. (Original) The apparatus of claim 1, wherein the scalar and vector processing units 
communicate with the host processor through an interrupt mechanism.

21. (Original) The apparatus of claim 1, wherein the scalar and vector processing units are
accessible by the host processor, through a set of memory mapped addresses. 

22. (Original) The apparatus of claim 1, wherein the IC may be a co-processor to the host, 
wherein the IC may be a stand-alone processor coupled to a bus of the data processing 
system, and wherein the chipset may be a core logic chip having a host interface coupled 
to the host processor and memory interface coupled to the host memory. 

23. (Original) The apparatus of claim 5, further comprises a special purpose register (SPR) 
file coupled to the chip interconnect.

24. (Previously Presented) A method, in an integrated circuit (IC) having a chip
interconnect, of a data processing system having at least one host processor and a host 
memory, the method comprising: 

receiving a data stream from an input/output (I/O) interface coupled to the chip
interconnect, the I/O interface capable of being coupled to an I/O controller
external to the IC, the I/O controller controlling I/O devices of the data
processing system external to the IC, the I/O interface coupled with a memory
interface for accessing a memory external to the IC, the memory interface
including a non-coherent interface for interfacing the IC with the host memory,
the memory interface including a coherent interface for interfacing the IC with
a cache memory via the at least one host processor, the memory interface
coupled with a memory controller to determine whether to access the memory
through the coherent interface or the non-coherent interface;

examining data of the data stream to determine whether the data requires scalar data
processing or vector data processing;

performing scalar data processing on the data in the IC, if the data requires scalar data
processing; and 

performing vector data processing on the data in the IC, if the data requires vector data 
processing, wherein receiving the data stream, examining the data, the scalar 
data processing, and the vector data processing are performed within the IC 
which is a single chipset interfacing the at least one host processor and the host 
memory with other components of the data processing system including the I/O 
controller and the I/O devices, wherein the at least one host processor, the host 
memory, the I/O controller, and the I/O devices are external to the IC, and 
wherein the I/O controller and the I/O devices communicate with the host 
processor and the host memory via the IC.

25. (Original) The method of claim 24, further comprising:

a scalar processing unit coupled to the chip interconnect, the scalar processing unit
performing scalar data processing on the data; and

a vector processing unit coupled to the chip interconnect, the vector processing unit
performing vector data processing on the data.

26. (Previously Presented) The method of claim 25, further comprising multiple scalar 
processing units performing scalar data processing substantially simultaneously and multiple 
vector processing units performing vector data processing substantially simultaneously. 

27. (Previously Presented) The method of claim 25, further comprising: 

dispatching the data to the scalar processing unit if the data requires scalar data 
processing; and 

dispatching the data to the vector processing unit if the data requires vector data 
processing. 

28. (Original) The method of claim 27, wherein the dispatching is performed by a switch 
mechanism coupled to the chip interconnect, the switch mechanism receiving the data from 
the I/O interface. 

29. (Previously Presented) The method of claim 28, wherein the switch mechanism 
comprises an instruction unit (IUNIT), the IUNIT decoding the data. 

30. (Previously Presented) The method of claim 24, further comprising: 

a general purpose register (GPR) file coupled to the scalar processing unit; 

a vector register (VR) file coupled to the vector processing unit; 

a special purpose register (SPR) coupled to the chip interconnect; and

a load and store unit (LSU), the LSU executing instructions to load and store scalar
data from and to the GPR, and the LSU executing instructions to load and store
vector data from and to the VR.

31. (Original) The method of claim 30, further comprising a memory location coupled to the
chip interconnect, wherein the LSU loads and stores data from and to the memory location. 

32. (Original) The method of claim 31, further comprising transferring the data between the 
memory location and the host memory, through a direct memory access (DMA) operation. 

33. (Previously Presented) The method of claim 24, wherein the scalar processing unit 
comprises: 

an integer arithmetic and logic unit (IALU), the IALU executing instructions to 
perform simple scalar integer arithmetic and logical operations; 

an integer shift unit (ISHU), the ISHU executing instructions to perform scalar bit 
shifting and rotating operations; and 

a floating point unit (FPU), the FPU executing instructions to perform high precision 
scalar data processing. 

34. (Previously Presented) The method of claim 24, wherein the vector processing unit 
comprises: 

a vector permute unit (VPU), the VPU executing instructions to perform vector 
permute operations; 

a vector simple integer unit (VSIU), the VSIU executing instructions to perform vector
simple integer arithmetic and logical operations;

a vector complex integer unit (VCIU), the VCIU executing instructions to perform
vector complex integer arithmetic operations;

a vector look-up table unit (VLUT), the VLUT executing instructions to perform at
least one vector table look-up; and

a vector floating point unit (VFPU), the VFPU executing instructions to perform high
precision vector data processing.

35. (Original) The method of claim 34, wherein the VLUT comprises a memory location 
storing at least one look-up table (LUT). 

36. (Original) The method of claim 35, further comprising transferring data of the LUT from 
the host memory to the memory location, through a direct memory access (DMA) operation. 

37. (Original) The method of claim 24, wherein the scalar data processing and vector data 
processing are performed autonomously and asynchronously to the host processor. 

38. (Original) The method of claim 25, wherein the scalar processing unit and the vector 
processing unit communicate with the host processor through an interrupt mechanism. 

39. (Original) The method of claim 25, wherein the scalar processing unit and the vector 
processing unit are accessible by the host processor through a set of memory mapped
addresses. 

40. (Previously Presented) An apparatus, in an integrated circuit (IC) having a chip 
interconnect, of a data processing system having at least one host processor and a host 
memory, the apparatus comprising: 

means for receiving a data stream from an input/output (I/O) interface coupled to the 
chip interconnect, the I/O interface capable of being coupled to an I/O 
controller external to the IC, the I/O controller controlling I/O devices of the
data processing system external to the IC, the I/O interface coupled with a
memory interface for accessing a memory external to the IC, the memory 
interface including a non-coherent interface for interfacing the IC with the host 
memory, the memory interface including a coherent interface for interfacing 
the IC with a cache memory via the at least one host processor, the memory 
interface coupled with a memory controller to determine whether to access the 
memory through the coherent interface or the non-coherent interface; 

means for examining the data to determine whether the data requires scalar data 
processing or vector data processing; 

means for performing scalar data processing on the data in the IC, if the data requires 
scalar data processing; and 

means for performing vector data processing on the data in the IC, if the data requires 
vector data processing, wherein receiving the data stream, examining the data, 
the scalar data processing, and the vector data processing are performed within 
the IC which is a single chipset interfacing the at least one host processor and 
the host memory with other components of the data processing system 
including the I/O controller and the I/O devices, wherein the at least one host 
processor, the host memory, the I/O controller, and the I/O devices are external 
to the IC, and wherein the I/O controller and the I/O devices communicate with 
the host processor and the host memory via the IC. 



41. (Original) The apparatus of claim 40, further comprising:

a scalar processing unit coupled to the chip interconnect, the scalar processing unit
performing scalar data processing on the data; and

a vector processing unit coupled to the chip interconnect, the vector processing unit
performing vector data processing on the data.

42. (Previously Presented) The apparatus of claim 41, further comprising multiple scalar
processing units performing scalar data processing substantially simultaneously and 
multiple vector processing units performing vector data processing substantially 
simultaneously. 

43. (Previously Presented) The apparatus of claim 41, further comprising:

means for dispatching the data to the scalar processing unit if the data requires scalar 
data processing; and 

means for dispatching the data to the vector processing unit if the data requires vector 
data processing. 

44. (Original) The apparatus of claim 43, wherein the dispatching is performed by a switch
mechanism coupled to the chip interconnect, the switch mechanism receiving the data 
from the I/O interface. 

45. (Previously Presented) The apparatus of claim 44, wherein the switch mechanism 
comprises an instruction unit (IUNIT), the IUNIT decoding the data. 

46. (Previously Presented) The apparatus of claim 40, further comprising: 

a general purpose register (GPR) file coupled to the scalar processing unit; 

a vector register (VR) file coupled to the vector processing unit; 

a special purpose register (SPR) coupled to the chip interconnect; and 

a load and store unit (LSU), the LSU executing instructions to load and store scalar
data from and to the GPR, and the LSU executing instructions to load and store
vector data from and to the VR.

47. (Original) The apparatus of claim 46, further comprising a memory location coupled to
the chip interconnect, wherein the LSU loads and stores data from and to the memory 
location. 

48. (Original) The apparatus of claim 47, further comprising means for transferring the data 
between the memory location and the host memory, through a direct memory access (DMA) 
operation. 

49. (Previously Presented) The apparatus of claim 40, wherein the scalar processing unit 
comprises: 

an integer arithmetic and logic unit (IALU), the IALU executing instructions to 
perform simple scalar integer arithmetic and logical operations; 

an integer shift unit (ISHU), the ISHU executing instructions to perform scalar bit 
shifting and rotating operations; and 

a floating point unit (FPU), the FPU executing instructions to perform high precision 
scalar data processing. 

50. (Previously Presented) The apparatus of claim 40, wherein the vector processing unit 
comprises: 

a vector permute unit (VPU), the VPU executing instructions to perform vector 
permute operations; 

a vector simple integer unit (VSIU), the VSIU executing instructions to perform vector
simple integer arithmetic and logical operations;

a vector complex integer unit (VCIU), the VCIU executing instructions to perform
vector complex integer arithmetic operations;

a vector look-up table unit (VLUT), the VLUT executing instructions to perform at
least one vector table look-up; and

a vector floating point unit (VFPU), the VFPU executing instructions to perform high
precision vector data processing. 

51. (Original) The apparatus of claim 50, wherein the VLUT comprises a memory location
storing at least one look-up table (LUT). 

52. (Original) The apparatus of claim 51, further comprising means for transferring data of
the LUT from the host memory to the memory location, through a direct memory access 
(DMA) operation. 

53. (Original) The apparatus of claim 40, wherein the scalar data processing and vector data
processing are performed autonomously and asynchronously to the host processor. 

54. (Original) The apparatus of claim 41, wherein the scalar processing unit and the vector 
processing unit communicate with the host processor through an interrupt mechanism. 

55. (Original) The apparatus of claim 41, wherein the scalar processing unit and the vector 
processing unit are accessible by the host processor through a set of memory mapped
addresses. 

56.-71. (Canceled)